    Video Fragmentation and Reverse Search on the Web

    This chapter is focused on methods and tools for video fragmentation and reverse search on the web. These technologies can assist journalists when they are dealing with fake news—which nowadays are being rapidly spread via social media platforms—that rely on the reuse of a previously posted video from a past event with the intention to mislead the viewers about a contemporary event. The fragmentation of a video into visually and temporally coherent parts and the extraction of a representative keyframe for each defined fragment enables the provision of a complete and concise keyframe-based summary of the video. Contrary to straightforward approaches that sample video frames with a constant step, the generated summary through video fragmentation and keyframe extraction is considerably more effective for discovering the video content and performing a fragment-level search for the video on the web. This chapter starts by explaining the nature and characteristics of this type of reuse-based fake news in its introductory part, and continues with an overview of existing approaches for temporal fragmentation of single-shot videos into sub-shots (the most appropriate level of temporal granularity when dealing with user-generated videos) and tools for performing reverse search of a video on the web. Subsequently, it describes two state-of-the-art methods for video sub-shot fragmentation—one relying on the assessment of the visual coherence over sequences of frames, and another one that is based on the identification of camera activity during the video recording—and presents the InVID web application that enables the fine-grained (at the fragment-level) reverse search for near-duplicates of a given video on the web. 
    Next, the chapter reports the findings of a series of experimental evaluations regarding the efficiency of the above-mentioned technologies, which indicate their competence to generate a concise and complete keyframe-based summary of the video content, and the use of this fragment-level representation for fine-grained reverse video search on the web. Finally, it draws conclusions about the effectiveness of the presented technologies and outlines our future plans for further advancing them.

    StateLens: A Reverse Engineering Solution for Making Existing Dynamic Touchscreens Accessible

    Blind people frequently encounter inaccessible dynamic touchscreens in their everyday lives that are difficult, frustrating, and often impossible to use independently. Touchscreens are often the only way to control everything from coffee machines and payment terminals to subway ticket machines and in-flight entertainment systems. Interacting with dynamic touchscreens is difficult non-visually because the visual user interfaces change, interactions often occur over multiple different screens, and it is easy to accidentally trigger interface actions while exploring the screen. To solve these problems, we introduce StateLens, a three-part reverse engineering solution that makes existing dynamic touchscreens accessible. First, StateLens reverse engineers the underlying state diagrams of existing interfaces using point-of-view videos found online or taken by users using a hybrid crowd-computer vision pipeline. Second, using the state diagrams, StateLens automatically generates conversational agents to guide blind users through specifying the tasks that the interface can perform, allowing the StateLens iOS application to provide interactive guidance and feedback so that blind users can access the interface. Finally, a set of 3D-printed accessories enables blind people to explore capacitive touchscreens without the risk of triggering accidental touches on the interface. Our technical evaluation shows that StateLens can accurately reconstruct interfaces from stationary, hand-held, and web videos, and a user study of the complete system demonstrates that StateLens successfully enables blind users to access otherwise inaccessible dynamic touchscreens.
    Comment: ACM UIST 201

    Finding Semantically Related Videos in Closed Collections

    Modern newsroom tools offer advanced functionality for automatic and semi-automatic content collection from the web and social media sources to accompany news stories. However, the content collected in this way often tends to be unstructured and may include irrelevant items. An important step in the verification process is to organize this content, both with respect to what it shows and with respect to its origin. This chapter presents our efforts in this direction, which resulted in two components. One aims to detect semantic concepts in video shots, to help annotation and organization of content collections. We implement a system based on deep learning, featuring a number of advances and adaptations of existing algorithms to increase performance for the task. The other component aims to detect logos in videos in order to identify their provenance. We present our progress from a keypoint-based detection system to a system based on deep learning.

    Loosely distinctive features for robust surface alignment

    Many successful feature detectors and descriptors exist for 2D intensity images. However, obtaining the same effectiveness in the domain of 3D objects has proven to be a more elusive goal. In fact, the smoothness often found in surfaces and the lack of texture information on the range images produced by conventional 3D scanners hinder both the localization of interesting points and the distinctiveness of their characterization in terms of descriptors. To overcome these limitations, several approaches have been suggested, ranging from the simple enlargement of the area over which the descriptors are computed to the reliance on external texture information. In this paper we offer a change in perspective, where a game-theoretic matching technique that exploits global geometric consistency makes it possible to obtain an extremely robust surface registration even when coupled with simple surface features exhibiting very low distinctiveness. In order to assess the performance of the whole approach, we compare it with state-of-the-art alignment pipelines. Furthermore, we show that using the novel feature points with well-known alternative non-global matching techniques leads to poorer results. © 2010 Springer-Verlag

    RGB-D Odometry and SLAM

    The emergence of modern RGB-D sensors has had a significant impact on many application fields, including robotics, augmented reality (AR) and 3D scanning. They are low-cost, low-power and low-size alternatives to traditional range sensors such as LiDAR. Moreover, unlike RGB cameras, RGB-D sensors provide the additional depth information that removes the need for frame-by-frame triangulation for 3D scene reconstruction. These merits have made them very popular in mobile robotics and AR, where it is of great interest to estimate ego-motion and 3D scene structure. Such spatial understanding can enable robots to navigate autonomously without collisions and allow users to insert virtual entities consistent with the image stream. In this chapter, we review common formulations of odometry and Simultaneous Localization and Mapping (known by its acronym SLAM) using RGB-D stream input. The two topics are closely related, as the former aims to track the incremental camera motion with respect to a local map of the scene, and the latter to jointly estimate the camera trajectory and the global map with consistency. In both cases, the standard approaches minimize a cost function using nonlinear optimization techniques. This chapter consists of three main parts. In the first part, we introduce the basic concept of odometry and SLAM and motivate the use of RGB-D sensors. We also give mathematical preliminaries relevant to most odometry and SLAM algorithms. In the second part, we detail the three main components of SLAM systems: camera pose tracking, scene mapping and loop closing. For each component, we describe different approaches proposed in the literature. In the final part, we provide a brief discussion on advanced research topics with references to the state of the art.
    Comment: This is the pre-submission version of the manuscript that was later edited and published as a chapter in RGB-D Image Analysis and Processin

    Two-particle correlations in azimuthal angle and pseudorapidity in inelastic p + p interactions at the CERN Super Proton Synchrotron

    Results on two-particle ΔηΔϕ correlations in inelastic p + p interactions at 20, 31, 40, 80, and 158 GeV/c are presented. The measurements were performed using the large acceptance NA61/SHINE hadron spectrometer at the CERN Super Proton Synchrotron. The data show structures which can be attributed mainly to effects of resonance decays, momentum conservation, and quantum statistics. The results are compared with the Epos and UrQMD models.
    ISSN: 1434-6044, ISSN: 1434-605

    POLICY PREFERENCE FORMATION IN LEGISLATIVE POLITICS: STRUCTURES, ACTORS, AND FOCAL POINTS

    This dissertation introduces and tests a model of policy preference formation in legislative politics. Emphasizing a dynamic relationship between structure, agent, and decision-making process, it ties the question of policy choice to the dimensionality of the normative political space and the strategic actions of parliamentary agenda-setters. The model proposes that structural factors, such as ideology, shape policy preferences to the extent that legislative specialists successfully link them to specific policy proposals through the provision of informational focal points. These focal points shift attention toward particular aspects of a legislative proposal, thus shaping the dominant interpretation of its content and consequences and, in turn, individual-level policy preferences. The propositions of the focal point model are tested empirically with data from the European Parliament (EP), using both qualitative (interview data, content analyses of parliamentary debates) and quantitative methods (multinomial logit regression analyses of roll-call votes). The findings have implications for our understanding of politics and law-making in the European Union and for the study of legislative decision-making more generally.

    Imposing Semi-Local Geometric Constraints for Accurate Correspondences Selection in Structure from Motion: A Game-Theoretic Perspective

    Most Structure from Motion pipelines are based on the iterative refinement of an initial batch of feature correspondences. Typically this is performed by selecting a set of match candidates based on their photometric similarity; an initial estimate of camera intrinsic and extrinsic parameters is then computed by minimizing the reprojection error. Finally, outliers in the initial correspondences are filtered by enforcing some global geometric property such as the epipolar constraint. In the literature many different approaches have been proposed to deal with each of these three steps, but almost invariably they separate the first inlier selection step, which is based only on local image properties, from the enforcement of global geometric consistency. Unfortunately, these two steps are not independent, since outliers can lead to inaccurate parameter estimation or even prevent convergence. This leads to the well-known sensitivity of all filtering approaches to the number of outliers, especially in the presence of structured noise, which can arise, for example, when the images present several repeated patterns. In this paper we introduce a novel stereo correspondence selection scheme that casts the problem into a Game-Theoretic framework in order to guide the inlier selection towards a consistent subset of correspondences. This is done by enforcing geometric constraints that do not depend on full knowledge of the motion parameters but rather on some semi-local property that can be estimated from the local appearance of the image features. The practical effectiveness of the proposed approach is confirmed by an extensive set of experiments and comparisons with state-of-the-art techniques.